Telomere-to-telomere gap-free genome assembly of the endangered Yangtze finless porpoise and East Asian finless porpoise

Abstract
Background: The Yangtze finless porpoise (Neophocaena asiaeorientalis asiaeorientalis, YFP) and the East Asian finless porpoise (Neophocaena asiaeorientalis sunameri, EFP) are 2 subspecies of the narrow-ridged finless porpoise that live in freshwater and saltwater, respectively. The main objective of this study was to provide contiguous chromosome-level genome assemblies for YFP and EFP.
Results: We generated upgraded telomere-to-telomere genomes of YFP and EFP by integrating PacBio HiFi long reads, ultra-long ONT reads, and Hi-C sequencing data, yielding assemblies with total sizes of 2.48 Gb and 2.50 Gb, respectively. The scaffold N50 values of the 2 genomes were 125.12 Mb (YFP) and 128 Mb (EFP), with 1 contig per chromosome. The telomere repeats and centromere positions were clearly identified in both the YFP and EFP genomes. In total, 5,480 newfound genes were detected in the YFP genome, including 56 genes located in the newly identified centromere regions. Additionally, synteny blocks, structural similarities, phylogenetic relationships, gene family expansion, and inference of selection were studied in connection with the genomes of other related mammals.
Conclusions: Our findings provide evidence for the gradual adaptation of EFP to a marine environment and the potential sensitivity of YFP to genetic damage. Compared to the 34 cetacean genomes sourced from public databases, the 2 new assemblies demonstrate superior continuity, with the longest contig N50 and scaffold N50 values as well as the lowest number of contigs. The improved telomere-to-telomere gap-free reference genome resources support conservation genetics and population management for finless porpoises.


Introduction
Finless porpoises (Neophocaena spp.) are small toothed whales that, unusually, inhabit both freshwater (the Yangtze River) and saltwater (coastal waters of southern and eastern Asia) environments [1,2]. They are characterized by a blunt, rounded head, upper and lower jaws of equal width, and the lack of a distinct dorsal fin [3,4]. Based on morphological characteristics, geographic distribution and molecular genetic evidence, it is generally believed that the finless porpoise can be divided into two species, namely the Indo-Pacific finless porpoise (N. phocaenoides) and the narrow-ridged finless porpoise (N. asiaeorientalis). In China, there are two subspecies of the narrow-ridged finless porpoise: one is the freshwater population (Yangtze finless porpoise, N. a. asiaeorientalis), which exclusively inhabits the middle and lower reaches of the Yangtze River and the adjacent Dongting and Poyang lakes; the other is the marine population (East Asian finless porpoise, N. a. sunameri), which occurs in the coastal waters of the Yellow Sea and Bohai Sea, as well as the northern waters of the East China Sea [5,6] (Figure 1A).
The investigation of the evolutionary origins and conservation genetics of finless porpoises is an urgent priority for scientists. Yang et al. identified significant genetic structure between the Indo-Pacific finless porpoise and the other two populations by analyzing mtDNA control region sequences of finless porpoises in Chinese waters [7]. This result was supported by subsequent analyses of mtDNA sequences, nuclear DNA microsatellites, single nucleotide polymorphisms (SNPs) and MHC loci [8-12]. The genetic diversity of East Asian finless porpoises surpasses that of the other two populations, suggesting that this population is the likely center of origin for the species. Zheng et al. analyzed mtDNA control region sequences of seven local populations of Yangtze finless porpoises in the middle and lower reaches of the Yangtze River, and found that the overall level of genetic diversity was low. Notably, the downstream population showed richer genetic variation than the midstream population. Such a genetic pattern reflects, to some extent, the marine origin and evolutionary history of the Yangtze population [13]. Genomic analysis of finless porpoise populations identified significant genetic structure among the three populations, indicating local adaptive evolution and emphasizing the evolutionary distinctiveness and conservation significance of the Yangtze finless porpoise [14].
The availability of a high-quality genome assembly is not only critical for genomic studies of finless porpoises, but would also be a valuable resource for comparative genomics and evolutionary studies of cetaceans. The first draft of the YFP genome assembly was published in 2018 with a size of 2.3 Gb, and was generated by short-read sequencing on the Illumina HiSeq 2000 platform [14].
However, this draft was highly fragmented and consisted of 104 scaffolds with an N50 of 6.3 Mb.
Although this draft enabled progress in genome-wide studies of YFP, such as immune changes with age and gene expression profiles in different habitats [15,16], the lack of chromosomal information has limited genomic studies of YFP. Recent advances in ultra-long ONT and PacBio HiFi sequencing technologies, as well as assembly algorithms, have facilitated the development of telomere-to-telomere (T2T) genome assemblies.
The completion of the T2T human genome sequence and the full Y chromosome sequence represents a significant milestone in human genomics research, offering great potential for comprehensive genomic analysis in evolutionary studies [17,18]. T2T assembly has since become an active area of genomic research and has been extended to other animal species, such as chicken and fish [19,20]. T2T genome assemblies can serve as benchmark references with enhanced accuracy and completeness for future studies, facilitating the confident identification and annotation of genes, regulatory elements, and other functional components.
A high-quality genome supports finer genetic analyses, for example of the length, number, and distribution of key indicators of inbreeding such as runs of homozygosity (ROH) and identity-by-descent (IBD) segments [21,22], and these analyses are more urgent than ever for the conservation of endangered species. In this study, we utilized PacBio HiFi, Nanopore and Hi-C data to generate two improved telomere-to-telomere gap-free genomes of Yangtze and East Asian finless porpoises. We compared the quality of the new assemblies with previously available versions and explored synteny blocks, comparative genomics, phylogenetic relationships, gene family expansion and selection pressure in relation to several other mammals. Finless porpoises serve as a representative example for understanding speciation, evolution, and population genetics. The high-quality chromosome-level references facilitate the elucidation of adaptation mechanisms in aquatic mammals.

Genome sequencing and gap-free assembly
We integrated PacBio HiFi long reads, ultra-long ONT reads and Hi-C sequencing data to generate chromosome-level genome assemblies for YFP and EFP. We generated approximately 123 Gb (49x) of PacBio HiFi reads, 279 Gb (111x) of Hi-C reads and 225 Gb (90x) of ONT reads for YFP (Supplementary Table S1). Based on the previously sequenced 62x PacBio HiFi and 85x Hi-C reads of the EFP [15], we generated 215 Gb (86x) of ONT reads in this study (Supplementary Table S1).
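The relationship between the data yields and the fold-coverage values quoted above can be checked with simple arithmetic. The sketch below assumes a nominal genome size of 2.5 Gb; the text does not state which genome size was used to derive the depths, so `NOMINAL_GENOME_GB` and the helper `depth` are illustrative only.

```python
# Sequencing depth = total bases sequenced / genome size.
# ASSUMPTION: a nominal genome size of 2.5 Gb; the source does not say
# which genome size was used to derive the reported depths.
NOMINAL_GENOME_GB = 2.5


def depth(total_gb: float, genome_gb: float = NOMINAL_GENOME_GB) -> float:
    """Return the approximate fold-coverage for a data set of total_gb gigabases."""
    return total_gb / genome_gb


# Data yields reported in the text (Gb), keyed by a descriptive label.
yields_gb = {
    "YFP PacBio HiFi": 123,
    "YFP Hi-C": 279,
    "YFP ONT": 225,
    "EFP ONT": 215,
}

for label, gb in yields_gb.items():
    print(f"{label}: {depth(gb):.1f}x")
```

Under this assumption the truncated depths agree with the 49x, 111x, 90x and 86x values given in the text.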
The genome assembly of the YFP comprised 23 scaffolds, with both contig N50 and scaffold N50 measuring 125.12 Mb (Table 1). These scaffolds were assembled into 21 autosomal chromosomes, one X chromosome, and one mitochondrial chromosome, resulting in a final assembly size of 2.48 Gb (Supplementary Table S2). Similarly, the genome assembly for EFP consisted of 24 contigs or scaffolds, with both contig N50 and scaffold N50 measuring 128.00 Mb (Table 1). These scaffolds were assembled into 21 autosomal chromosomes, the X and Y chromosomes, and one mitochondrial chromosome, with a final assembly size of 2.50 Gb (Supplementary Table S2).
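For readers unfamiliar with the N50 statistic used above, the following minimal sketch computes it from a list of sequence lengths. The toy lengths are invented for illustration and are not taken from the assemblies; the function name `n50` is likewise hypothetical.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (arbitrary units, not real data): total is 290, half is 145.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

When every chromosome is represented by a single gap-free contig, the contig and scaffold length lists are identical, which is why the contig N50 and scaffold N50 coincide for each assembly.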
Compared to the YFP v1.0 (GCF_000442215) and EFP v1.0 (GCA_026225855) assemblies, we have significantly enhanced the contiguity, accuracy and completeness of these two genome assemblies. The contig N50 values of the two genomes were consistent with their respective chromosome lengths, and a single contig represented a complete chromosome, which is notably superior to the previously published finless porpoise genomes (e.g., 125.12 Mb vs. 0.09 Mb for YFP and 128.00 Mb vs. 84.69 Mb for EFP) (Table 1). The YFP v1.0 and EFP v1.0 genome assemblies had 52,647 and 28 gaps, respectively, whereas in the new assemblies we filled all the gaps and obtained gap-free genomes, greatly improving the contiguity of the assembled sequences (Figure 2A and Figure 2B). Merqury estimated quality values (QV) of 60.18 and 64.38 for YFP and EFP, respectively, based on k-mer analysis, indicating that our assemblies were of high quality (Supplementary Table S3). Moreover, the mapping rates of RNA reads to the two genome assemblies were 95.47% and 95.42%, respectively, whereas they were 70.01% and 92.34%, correspondingly, for the previously published assemblies (Supplementary Table S4). Using Benchmarking Universal Single-Copy Orthologs (BUSCO) evaluation, we achieved 95.20% completeness for YFP and 95.30% completeness for EFP (Table 1 and Figure 1E). Additionally, we identified telomeric repeat units and centromere regions in YFP and EFP (Figure 2A, Figure 2B and Supplementary Table S5-S6). Telomeric repeat units of the YFP genome were detected at both ends of 18 chromosomes and at one end of 3 chromosomes. Similarly, telomeric repeat units of the EFP genome were detected at both ends of 20 chromosomes and at one end of 2 chromosomes. Notably, the centromere regions of each chromosome were predicted in the newly assembled YFP and EFP genomes, whereas the associated centromere sequences were not predicted in the first draft of YFP.
The new genome assemblies of YFP and EFP were well assembled without any gaps, achieving nearly telomere-to-telomere (T2T) completeness. Finally, we also utilized Hi-C data to order and orient the chromosome sequences (Figure 1C and Figure 1D).
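To illustrate how chromosomes can be classified as carrying telomeric repeats at both ends or at one end only, the sketch below scans the terminal windows of a sequence for tandem copies of the canonical vertebrate telomeric unit TTAGGG (or its reverse complement CCCTAA). This is not the pipeline used in the study; the window size, the minimum copy number and the function names are illustrative assumptions.

```python
import re

# Canonical vertebrate telomeric repeat unit and its reverse complement.
FORWARD = "TTAGGG"
REVERSE = "CCCTAA"


def longest_tandem_run(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of unit found in seq."""
    runs = re.findall(f"(?:{unit})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def telomere_ends(chrom: str, window: int = 1000, min_copies: int = 4):
    """Return (start_has_telomere, end_has_telomere) for one chromosome.

    window and min_copies are hypothetical thresholds, not values from the study.
    """
    def has_repeat(segment: str) -> bool:
        best = max(longest_tandem_run(segment, FORWARD),
                   longest_tandem_run(segment, REVERSE))
        return best >= min_copies

    return has_repeat(chrom[:window]), has_repeat(chrom[-window:])


# Toy chromosome: a clear repeat array at the start, only two copies at the end.
toy = "CCCTAA" * 6 + "ACGT" * 50 + "TTAGGG" * 2
print(telomere_ends(toy, window=50))
```

In this toy case only the start of the sequence passes the threshold, which corresponds to a chromosome counted as having a telomere at a single end.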

Gene prediction and annotation
Two strategies, de novo and homology-based methods, were applied to annotate repeat elements. The genomes of YFP and EFP contained 1,058.09 Mb (42.54%) and 1,069.10 Mb (42.80%) of repetitive sequences, respectively (Supplementary Figure S1-S2 and Table S7). Long interspersed nuclear elements (LINEs) were the most abundant type of annotated transposable elements, constituting 38.88% and 39.10% of the genomes of YFP and EFP, respectively (Supplementary Table S8). For gene content assessment, 8 homologous proteins and 24 RNA-seq datasets were used (Supplementary Table S9-S10). In total, we predicted 23,139 and 23,101 protein-coding genes in the YFP and EFP genomes, respectively (Table 1), with average coding sequence (CDS) lengths of 1,507 bp and 1,510 bp, respectively. The average exon length was 175 bp in both genomes, and the average intron length was 6,082 bp and 6,107 bp, respectively (Supplementary Table S11-S12). In the YFP and EFP genomes, 99.96% and 99.95% of protein-coding genes, respectively, were supported by at least one line of evidence with a CDS overlap ratio greater than 80% (Supplementary Table S13-S14). Notably, the length distributions of gene models at the levels of genes, CDS, exons and introns showed a similar trend to those of YFP (GCF_000442215), EFP (GCA_026225855) and bottlenose dolphin (GCF_011762595) (Supplementary Figure S3). In the predicted gene models of YFP and EFP, the BUSCO analysis identified 97.5% and 97.6% complete conserved single-copy mammalian genes (odb10), respectively (Table 1 and Figure 1E). In total, 22,263 (96.21%) gene models in the YFP genome and 22,224 (96.20%) gene models in the EFP genome were annotated in at least one database (NR, SwissProt, KEGG, KOG, TrEMBL, Gene Ontology and InterPro) (Table 1, Figure 1F), whereas 71.22% (16,480) of YFP genes and 71.24% (16,457) of EFP genes were annotated in five functional databases (NR, SwissProt, KEGG, KOG and InterPro) (Supplementary Table S15 and Figure S4-S5). Finally, 20,589 (88.98%) YFP genes and 20,613 (89.23%) EFP genes could be transcriptionally detected in the 24 RNA-seq datasets (Figure 1F).

Analysis of centromere related genes
The centromere is an important functional structure of eukaryotic chromosomes and plays an important role in ensuring the correct segregation of chromosomes during cell division. The evolution of centromere structure among different species has not been fully resolved because of the challenge of assembling highly repetitive sequences [23]. In this study, we detected repeat monomers in the YFP and EFP genomes that may constitute the centromere (Figure 2A, Figure 2B and Supplementary Table S16-S17). In total, 235 and 237 genes were identified in the YFP and EFP candidate centromere regions, respectively, of which 56 and 20 genes were located in the newly identified centromere regions. We further compared the genes located in the centromere and non-centromere regions of the YFP and EFP genomes, and found no significant differences in the expression patterns of these genes across transcriptomes (Figure 3A, Figure 3B and Supplementary Table S18-S19). Additionally, we analyzed the functional enrichment of genes located in the centromere regions of the two genomes (Supplementary Figure S6-S7). In the GO enrichment analysis, these genes were mainly involved in localization, locomotion, transcription receptor activity and cytoskeletal motor activity (Figure 3C, Figure 3D and Supplementary Table S20-S23).
Genes in the mutated regions of YFP and EFP are widely enriched in immune-related pathways, which may be closely related to their different habitats in freshwater and seawater, respectively. The microbial composition and pathogenicity of freshwater and seawater environments differ greatly, as do the effects of pathogenic microorganisms on the host. YFP and EFP are therefore expected to have undergone adaptive evolution in response to the pathogen stresses of their respective freshwater and seawater environments.
Alterations in a considerable number of immune-related genes may be important in facilitating the adaptation of YFP and EFP to freshwater and seawater habitats, respectively. Specifically, comparison with the previous versions of the assemblies showed that the new YFP assembly contained 5,480 newly assembled genes, while the new EFP assembly contained 1,453 (Table 1 and Supplementary Table S26). These genes were expressed in all 24 samples (Supplementary Figure S8-S9 and Table S27-S28). GO functional enrichment analysis showed that these genes are mainly enriched in cellular process, metabolic process, cellular anatomical entity, binding and catalytic activity (Figure 4D, Figure 4E, Supplementary Table S29-S32 and Figure S10-S11).

Phylogeny and synteny analysis
Gene family analysis was performed on 506,098 protein-coding sequences from ten cetaceans and sixteen terrestrial mammals, which were clustered into 22,196 gene families, including 594 species-specific genes in YFP and EFP (Supplementary Table S7 and Table S33). A total of 2,161 single-copy orthologous genes were aligned using MAFFT (Supplementary Figure S12). Our analysis revealed that the divergence time between YFP and EFP ranges from 0.5 to 1.1 million years ago (Figure 5A, Figure 5B and Supplementary Figure S13), which is the first estimate at the molecular level since their classification as two distinct subspecies. Synteny analysis demonstrated that YFP displayed a greater level of conservation than EFP (Figure 1B). We observed similar patterns in the distribution of gene frequency, gene density, TE density, and GC density between the two genomes, with most of their chromosomes aligning with each other. Notably, certain chromosomes in the YFP genome matched multiple chromosomes in the EFP genome, indicating that chromosomal rearrangements occurred in both genomes after their divergence.

Gene family and positive selection analysis
We used CAFÉ v4.0 to analyze the evolution of gene families based on orthologous clusters of protein-coding sequences from twenty-six mammals. When comparing the genomes of YFP and EFP with their last common ancestor, we found that 843 gene families expanded while 98 contracted (Figure 5A and Figure 5B). Of these, 215 gene families comprising a total of 2,674 genes were significantly expanded in the YFP and EFP lineage (P < 0.05) (Supplementary Table S34). The expanded genes were significantly enriched in several KEGG pathways, including "antigen processing and presentation", "intestinal immune network for IgA production", "oxidative phosphorylation" and "calcium signaling pathway" (Figure 5C). Significantly enriched GO terms among the expanded genes included "ferric iron binding", "iron ion transport", "riboflavin biosynthetic process", "tetrahydrofolate biosynthetic process" and "cytochrome-c oxidase activity" (Figure 5D). We postulate that these genes play a crucial role in regulating osmotic pressure, enhancing immune resistance and facilitating hypoxic tolerance in finless porpoises, thereby reflecting potential mechanisms of adaptation to the aquatic environment. Further investigations are required to elucidate the specific functions of these gene families and their potential significance in the biology of finless porpoises.
The Codeml program in PAML with a branch-site model was employed for selective pressure analyses based on orthologous clusters of 10 cetaceans: Yangtze finless porpoise, East Asian finless porpoise, bottlenose dolphin, killer whale, Yangtze River dolphin, sperm whale, minke whale, bowhead whale, beluga whale and Chinese white dolphin. We identified 41 positively selected genes (PSGs) in the YFP lineage, which were functionally enriched in "RNA degradation", "nucleotide excision repair", "DNA replication", "mismatch repair", and "homologous recombination" (P < 0.05) (Figure 5E and Supplementary Table S35). The evolution of DNA damage repair pathways implies the existence of additional triggers for genomic instability in the Yangtze River, including human activities such as water-related engineering projects, dredging and quarrying. Interestingly, 44 PSGs in the EFP lineage were involved in "sodium-dependent phosphate transport", "sodium symporter activity", "aldosterone-regulated sodium reabsorption", and "calcium signaling pathway" (Figure 5E and Supplementary Table S36). Among these PSGs, six were potentially associated with the adaptation of EFP to a high-osmolarity environment, including Na(+)/H(+) exchange regulatory cofactor (NHE-RF2) and sodium-dependent phosphate cotransporter (SLC34). These results indicate that Yangtze and East Asian finless porpoises may possess distinct adaptation strategies to their aquatic environments, which warrant further investigation.

Sample collection, DNA extraction, and sequencing
We collected a sample from a dead adult female YFP from Lianzhou Lake, Anqing City, Anhui Province, China (N30°15ʹ32ʺ, E116°54ʹ38ʺ) in 2021 and a sample from a dead juvenile male EFP from the Yellow Sea near Lianyungang City, Jiangsu Province, China (N34°55ʹ27ʺ, E119°11ʹ37ʺ) in 2019 for sequencing (Figure 1A). No ethical considerations were taken into account in this study. DNA was extracted from muscle tissues following the phenol/chloroform DNA extraction method. DNA extracted from the YFP was utilized to construct PacBio HiFi, Hi-C, and Oxford Nanopore Technologies (ONT) libraries. DNA extracted from the EFP was utilized to construct an ONT library.
According to the manufacturer's instructions (QIAGEN, Germany), a PacBio HiFi library was constructed using a QIAGEN Blood & Cell Culture DNA Midi Kit and subsequently sequenced on the PacBio Sequel II system in circular consensus sequence (CCS) mode. A Hi-C library was generated using the MboI restriction enzyme and subsequently sequenced on the BGI MGISEQ platform. To generate and sequence ONT libraries, we isolated genomic DNA using the CTAB method, selected fragments exceeding 5 kb in size with the SageHLS HMW library system (Sage Science), processed the DNA with the Ligation sequencing 1D kit (SQK-LSK109, Oxford Nanopore Technologies, Oxford, UK), and subsequently sequenced the ONT libraries on a PromethION platform (Oxford Nanopore Technologies) at BGI (Wuhan, China).
Various methods were employed to evaluate the quality of the gap-free genome assemblies in terms of contiguity, correctness, and completeness. First, we calculated the length metrics of the genomic sequences to evaluate contiguity and subsequently used Merqury (v1.3) [32] with k-mer size set to 21 to assess accuracy. Second, Benchmarking Universal Single-Copy Orthologs (BUSCO) [33] evaluation was conducted to assess completeness. Third, we mapped PacBio HiFi, ONT and RNA-seq data to the genome assemblies using Minimap2 [34] and Hisat2 (v2.1.0) [35] to further assess completeness. In addition, we utilized the quarTeT pipeline [36] to search for telomere repeat sequences and centromere regions in YFP and EFP.
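As background for the k-mer based accuracy assessment, the sketch below shows how a consensus quality value can be derived from k-mer counts using the formulation published for Merqury, and how a QV translates into a per-base error rate. The k-mer counts passed to the function are hypothetical inputs, and the function names are illustrative rather than part of any tool's interface.

```python
import math


def kmer_based_qv(asm_only_kmers: int, total_asm_kmers: int, k: int = 21) -> float:
    """Estimate a Phred-scaled consensus QV from k-mer counts.

    asm_only_kmers: k-mers found in the assembly but absent from the reads.
    total_asm_kmers: all k-mers found in the assembly.
    """
    if not 0 < asm_only_kmers <= total_asm_kmers:
        raise ValueError("need 0 < asm_only_kmers <= total_asm_kmers")
    # Probability that a single base is correct, assuming a k-mer is
    # error-free only if all k of its bases are correct.
    p_base_correct = (1 - asm_only_kmers / total_asm_kmers) ** (1 / k)
    error_rate = 1 - p_base_correct
    return -10 * math.log10(error_rate)


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled QV into an expected per-base error rate."""
    return 10 ** (-qv / 10)


# QVs reported in this study for YFP and EFP.
for name, qv in (("YFP", 60.18), ("EFP", 64.38)):
    print(f"{name}: QV {qv} ~ {qv_to_error_rate(qv):.2e} errors per base")
```

A QV of 60 corresponds to roughly one expected error per million bases, so both assemblies fall below that error rate.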
Genes were categorized as newly assembled if the first draft genome assembly exhibited a deletion of at least 50 bp and the gene region had a minimum overlap of 30% with that deletion. The ANNOVAR package (v 2013-06-21) [52] was utilized to annotate the variations.

Identification of newly assembled genes
SyRI (v1.6.3) was employed to detect structural variations between the v2.0 genome assembly and the previously published genome assembly of finless porpoises. A gene was classified as newly assembled if the previously published genome assembly exhibited a deletion of at least 50 bp and the gene region had a minimum overlap of 30% with that region.
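The classification rule above can be sketched as a simple interval test. The example assumes half-open coordinates and interprets the 30% overlap as a fraction of the gene length; the text does not specify either detail, so both are assumptions, and the function names and coordinates are hypothetical.

```python
def overlap_fraction(gene, deletion):
    """Fraction of the gene interval covered by the deletion interval.

    Both intervals are (start, end) tuples in half-open coordinates.
    ASSUMPTION: overlap is measured relative to the gene length.
    """
    g_start, g_end = gene
    d_start, d_end = deletion
    overlap = min(g_end, d_end) - max(g_start, d_start)
    return max(0, overlap) / (g_end - g_start)


def is_newly_assembled(gene, deletions, min_del_len=50, min_overlap=0.30):
    """Return True if any deletion of at least min_del_len bp in the earlier
    assembly covers at least min_overlap of the gene region."""
    return any(
        (d_end - d_start) >= min_del_len
        and overlap_fraction(gene, (d_start, d_end)) >= min_overlap
        for d_start, d_end in deletions
    )


# Hypothetical coordinates on one chromosome of the v2.0 assembly.
gene = (1000, 2000)
print(is_newly_assembled(gene, [(1500, 1900)]))  # 400 bp deletion, 40% of the gene
print(is_newly_assembled(gene, [(1900, 2500)]))  # 600 bp deletion, 10% of the gene
```

Measuring the overlap against the deletion length instead of the gene length would be an equally plausible reading and would only require changing the denominator in `overlap_fraction`.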

Gene family expansion and contraction analysis
Protein sequences of YFP, EFP and 24 published mammals were used to search for homologs. Based on the gene families clustered by OrthoFinder, the CAFÉ (v4.0) [60] software was used to perform expansion and contraction analyses on the finless porpoise branch. Random birth and death models were employed to study gains and losses of gene families in a user-specified phylogeny. The global parameter λ, which describes both the gene birth (λ) and death (μ = −λ) rate for gene families in all branches of the tree, was estimated using maximum likelihood. A P-value was then calculated for each gene family, and families with P-value ≤ 0.01 were defined as significantly expanded or contracted gene families. KEGG and GO enrichment analyses were conducted on these significantly expanded and contracted gene families.

Positive selection analysis
Protein sequences of the two finless porpoises and 8 other published cetaceans were used to identify single-copy orthologs with OrthoFinder. Ka/Ks ratios for these single-copy orthologs were then calculated in the following steps. First, global alignment of the single-copy orthologs was performed with PRANK, and the alignments were filtered with Gblocks. Then, Ka/Ks ratios on different branches were calculated by Codeml in the PAML package [57] with the free-ratio model. Genes that showed Ka/Ks values higher than 1 along the branch leading to finless porpoises were reanalyzed using the codon-based branch-site tests implemented in PAML (PAML, RRID:SCR_014932). The branch-site model allows ω to vary both among sites in the protein and across branches, and it was used to detect episodic positive selection.
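Branch-site tests of this kind are commonly evaluated with a likelihood-ratio test (LRT) that compares the log-likelihood of the alternative model with that of a null model in which ω is fixed at 1 on the foreground branch. The sketch below shows the arithmetic, assuming a chi-square reference distribution with one degree of freedom; the text does not state the exact test settings, and the log-likelihood values used here are hypothetical.

```python
import math


def branch_site_lrt(lnl_null: float, lnl_alt: float):
    """Return (LRT statistic, P-value) for two nested models.

    ASSUMPTION: the statistic is compared against a chi-square distribution
    with 1 degree of freedom, for which the survival function is
    erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # clamp tiny negative values
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for one gene (not values from this study).
stat, p = branch_site_lrt(lnl_null=-1000.0, lnl_alt=-996.5)
print(f"2*deltaLnL = {stat:.2f}, P = {p:.4f}")
```

In practice the resulting P-values would usually also be corrected for multiple testing across genes, a step that is not shown here.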

Conclusion
The availability of reliable chromosome-level genome assemblies provides remarkable improvements in identifying genes, characterizing genomic regions and performing comparative genomic analyses. In the present study, we assembled two telomere-to-telomere, gap-free genomes of the Yangtze finless porpoise and the East Asian finless porpoise by combining PacBio HiFi long reads, ultra-long ONT reads and Hi-C sequencing data. The new assemblies have higher contiguity and completeness, as well as more complete single-copy BUSCO genes with fewer fragmented or missing genes, than the first drafts. Genome synteny analysis revealed a robust collinear relationship between the Yangtze finless porpoise and the East Asian finless porpoise. Reconstructing ancestral chromosomes enabled the identification of chromosomal rearrangement events in the finless porpoise. The reconstructed phylogeny determined that the YFP and EFP constitute a clade and diverged approximately 0.5-1.1 million years ago (Ma). Gene family expansion analysis revealed significantly enriched pathways and GO terms associated with the regulation of osmotic pressure, immune resistance, and hypoxic tolerance. Selection pressure analysis identified genes associated with DNA damage repair in the YFP and with high salt tolerance in the EFP. The resolved centromeres, telomeres, and associated genes can serve as valuable resources for comprehensively understanding chromosome stability, undesired recombination, repair mechanisms, and evolutionary processes. Overall, these are the most contiguous assemblies among the cetacean genomes compared here, with chromosome-scale contigs and no gaps. This study lays a foundation for population genomics studies at the whole-genome level, and will deepen understanding of scientific questions related to population conservation and adaptation mechanisms.

Note: "v2.0" indicates the new genome assembly and annotation generated in this study; "v1.0" indicates the previously published first draft of the genome assembly and annotation. "C" indicates the percentage of complete BUSCOs. "pairs/single": "pairs" indicates that telomeres were found at both ends of the chromosome; "single" indicates that telomeres were found at only one end of the chromosome. "New-found genes" indicates genes that were predicted in the extra sequence segments of the current assembly and annotated in the current assembly.

Figure 1. Genome analysis and quality assessment of Yangtze and East Asian finless porpoises v2.0. A: Location distribution and sampling sites of the Yangtze and East Asian finless porpoises. B: Synteny analysis of Yangtze finless porpoise and East Asian finless porpoise v2.0 genomes: a) chromosome length; b) frequency of genes; c) density of genes; d) repeat density; e) GC density and f) syntenic blocks between Yangtze and East Asian finless porpoises. C: Heat map displaying Hi-C interactions of Yangtze finless porpoises v2.0. D: Heat map displaying Hi-C interactions of East Asian finless porpoises v2.0. E: BUSCO assessments exhibiting proportions classified as Complete and single-copy (S, blue), Complete and duplicated (D, green), Fragmented (F, yellow), and Missing (M, red) categories. F: Proportions of genes that could be functionally annotated and transcriptionally detected in Yangtze and East Asian finless porpoises v2.0.

Figure 2. T2T-resolved assembly of Yangtze and East Asian finless porpoises v2.0. A: Structure of T2T and gap-free chromosomes in Yangtze finless porpoises v2.0. All 21+X chromosomes of Yangtze finless porpoises v2.0 are drawn to scale and the ruler indicates chromosome length. Triangles indicate the presence of telomere sequence repeats. Circles represent the locations of centromeric regions. The gap positions in the Yangtze finless porpoises v1.0 genome assembly are marked with squares corresponding to the right side of the chromosome in the Yangtze finless porpoises v2.0 genome assembly. B: Structure of T2T and gap-free chromosomes in East Asian finless porpoises v2.0. All 21+X/Y chromosomes of East Asian finless porpoises v2.0 are drawn to scale and the ruler indicates chromosome length. Triangles indicate the presence of telomere sequence repeats. Circles represent the locations of centromeric regions. The gap positions in the East Asian finless porpoises v1.0 genome assembly are marked with squares corresponding to the right side of the chromosome in the East Asian finless porpoises v2.0 genome assembly.

Figure 3. Expression patterns and functional enrichment of genes in the centromere region. A: Heatmaps of the gene expression levels in centromere and non-centromere regions of Yangtze finless porpoises v2.0. B: Heatmaps of the gene expression levels in centromere and non-centromere regions of East Asian finless porpoises v2.0. C: GO enrichment analysis of genes in the centromere region of Yangtze finless porpoises v2.0. D: GO enrichment analysis of genes in the centromere region of East Asian finless porpoises v2.0.

Figure 4. Structural variants between the Yangtze and East Asian finless porpoise v2.0 genome assemblies, with the Yangtze finless porpoise v2.0 genome assembly as the reference. A: Density plot of SNPs between the Yangtze and East Asian finless porpoise v2.0 genome assemblies. B: Density plot of Indels between the Yangtze and East Asian finless porpoise v2.0 genome assemblies. C: KEGG enrichment analysis of genes located in SNP and Indel regions of the Yangtze finless porpoise v2.0 genome assembly. Blue bars indicate genes located in Indel regions, while red bars indicate genes located in SNP regions. D: GO enrichment of new-found genes of Yangtze finless porpoise v2.0. E: GO enrichment of new-found genes of East Asian finless porpoise v2.0.

Figure 5. Genome evolution of Yangtze and East Asian finless porpoises. A: Divergence time between Yangtze finless porpoises and East Asian finless porpoises, and number of expanded and contracted gene families. Green and red numbers indicate gene family expansions and contractions, respectively. MRCA: Most Recent Common Ancestor. Ma: Million years ago. B: A comparison of gene families associated with orthologs and paralogs in Yangtze finless porpoises, East Asian finless porpoises and 24 other mammal species. C: Significant KEGG and GO enrichment of expanded gene families in the Yangtze and East Asian finless porpoise lineage. D: KEGG enrichment analysis of positively selected genes in Yangtze finless porpoises (Neaa) and East Asian finless porpoises (Neas), respectively.
